LIGA and Syllabification Approach for Language Identification and Back Transliteration: Shared Task Report by DAIICT
Abstract
This paper presents a solution to Subtask 1 of the Shared Task on Transliterated Search at FIRE '14. The task concerns data that mixes English words with Indian-language words transliterated into the Roman (English) alphabet. It calls for language identification and subsequent back transliteration into the native Indian scripts. The proposed system uses the Language Identification Graph Approach (LIGA) to label words with their language markers, and rule-based transliteration using syllabification to obtain each back-transliterated word in its native script. Results for Gujarati: labelling accuracy 0.963 and back-transliteration F-measure 0.463. Results for Hindi: labelling accuracy 0.771 and back-transliteration F-measure 0.163.
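As a rough illustration of the graph-based labelling idea named in the abstract, the sketch below builds a character-trigram graph in which nodes are trigrams, edges are transitions between consecutive trigrams, and each carries per-language counts. A word is labelled with the language whose normalized node and edge weights sum highest. This is a simplified sketch under stated assumptions, not the authors' system: the class name `TrigramGraph`, the language tags `en` and `hi`, and the tiny training lists are all hypothetical.

```python
from collections import defaultdict


def trigrams(word):
    """Return the character trigrams of a word, padded with one space per side."""
    padded = f" {word.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


class TrigramGraph:
    """Toy LIGA-style graph: trigram nodes and trigram-to-trigram edges,
    each holding a count per language (hypothetical helper, for illustration)."""

    def __init__(self):
        self.nodes = defaultdict(lambda: defaultdict(int))
        self.edges = defaultdict(lambda: defaultdict(int))
        self.node_total = defaultdict(int)
        self.edge_total = defaultdict(int)

    def train(self, word, lang):
        grams = trigrams(word)
        for gram in grams:
            self.nodes[gram][lang] += 1
            self.node_total[lang] += 1
        for pair in zip(grams, grams[1:]):
            self.edges[pair][lang] += 1
            self.edge_total[lang] += 1

    def scores(self, word):
        """Sum each language's share of the matching nodes and edges."""
        grams = trigrams(word)
        result = defaultdict(float)
        for gram in grams:
            for lang, count in self.nodes.get(gram, {}).items():
                result[lang] += count / self.node_total[lang]
        for pair in zip(grams, grams[1:]):
            for lang, count in self.edges.get(pair, {}).items():
                result[lang] += count / self.edge_total[lang]
        return dict(result)

    def label(self, word):
        """Return the best-scoring language, or None if nothing matches."""
        scored = self.scores(word)
        return max(scored, key=scored.get) if scored else None


# Hypothetical toy training data; a real system would use large word lists.
graph = TrigramGraph()
for w in ["the", "search", "what", "song"]:
    graph.train(w, "en")
for w in ["kya", "hai", "mera", "pyaar"]:
    graph.train(w, "hi")

print(graph.label("there"), graph.label("haina"), graph.label("zzz"))
```

Normalizing each count by the language's total keeps a language with more training words from dominating the score. Words that share no trigram with the training data come back unlabelled, so a real system would need a fallback for them.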
Related papers
Mixed Script Ad hoc Retrieval using back transliteration and phrase matching through bigram indexing: Shared Task report by BIT, Mesra
This paper describes an approach to mixed-script ad hoc retrieval, a subtask of the FIRE 2015 Shared Task on Mixed Script Information Retrieval. We participated in Subtask 2 of the shared task, where a statistical model was used to carry out back transliteration to Devanagari script. To perform the search, a bigram-based index of the documents was used and search was performed using pivot t...
Machine Learning Approach for Language Identification & Transliteration: Shared Task Report of IITP-TS
In this paper, we describe the system that we developed as part of our participation in the FIRE-2014 Shared Task on Transliterated Search. We participated only in Subtask 1, which focused on labeling the query words. The entire process consists of the following subtasks: language identification of each word in the text, named entity recognition and classification (NERC) and transliteration of t...
A Hybrid Approach of English-Hindi Named-entity Transliteration
In recent years, machine transliteration has become a focus of research. Both machine translation and transliteration are important for e-governance and web-based online multilingual applications. Machine translation translates the source language into the target language, which results in wrong translations for named entities. Named entities are required to be translated with preserving t...
Transliterated Search using Syllabification Approach
Machine transliteration refers to the process of automatic conversion of a word from one language to another without losing its phonological characteristics. In this work, we present our experiments performed in subtask-1 and subtask-2 as a part of the FIRE-2013 transliterated search task. In both the subtasks, the transliteration from Roman script to Devanagari script was performed using sylla...
Machine Transliteration using Target-Language Grapheme and Phoneme: Multi-engine Transliteration Approach
This paper describes our approach to “NEWS 2009 Machine Transliteration Shared Task.” We built multiple transliteration engines based on different combinations of two transliteration models and three machine learning algorithms. Then, the outputs from these transliteration engines were combined using re-ranking functions. Our method was applied to all language pairs in “NEWS 2009 Machine Transl...